11 research outputs found

    Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models

    Spatially proximate amino acids in a protein tend to coevolve. A protein's three-dimensional (3D) structure hence leaves an echo of correlations in the evolutionary record. Reverse engineering 3D structures from such correlations is an open problem in structural biology, pursued with increasing vigor as more and more protein sequences continue to fill the data banks. Within this task lies a statistical inference problem, rooted in the following: correlation between two sites in a protein sequence can arise from firsthand interaction but can also be network-propagated via intermediate sites; observed correlation is not enough to guarantee proximity. To separate direct from indirect interactions is an instance of the general problem of inverse statistical mechanics, where the task is to learn model parameters (fields, couplings) from observables (magnetizations, correlations, samples) in large systems. In the context of protein sequences, the approach has been referred to as direct-coupling analysis. Here we show that the pseudolikelihood method, applied to 21-state Potts models describing the statistical properties of families of evolutionarily related proteins, significantly outperforms existing approaches to the direct-coupling analysis, the latter being based on standard mean-field techniques. This improved performance also relies on a modified score for the coupling strength. The results are verified using known crystal structures of specific sequence instances of various protein families. Code implementing the new method can be found at http://plmdca.csc.kth.se/.
    Comment: 19 pages, 16 figures, published version
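
    For orientation, the pseudolikelihood idea named in the abstract can be written out in its generic textbook form. This is a sketch under assumed notation, not the paper's exact objective: the symbols $\sigma$ (an aligned sequence of length $N$), $h_i$ (fields), $J_{ij}$ (couplings, with $J_{rj}(a,b) = J_{jr}(b,a)$ when $j < r$), and $B$ (number of sequences) are introduced here for illustration, and any regularization or sequence reweighting is left out.

```latex
% 21-state Potts model over an aligned sequence \sigma = (\sigma_1, ..., \sigma_N)
\[
P(\sigma) = \frac{1}{Z} \exp\Big( \sum_{i=1}^{N} h_i(\sigma_i)
          + \sum_{1 \le i < j \le N} J_{ij}(\sigma_i, \sigma_j) \Big)
\]

% Conditional probability of one site given all others: the partition function Z cancels
\[
P(\sigma_r \mid \sigma_{\setminus r}) =
  \frac{\exp\big( h_r(\sigma_r) + \sum_{j \ne r} J_{rj}(\sigma_r, \sigma_j) \big)}
       {\sum_{a=1}^{21} \exp\big( h_r(a) + \sum_{j \ne r} J_{rj}(a, \sigma_j) \big)}
\]

% Log-pseudolikelihood, averaged over B sequences, to be maximized over (h, J)
\[
\ell_{\mathrm{pseudo}}(h, J) = \frac{1}{B} \sum_{b=1}^{B} \sum_{r=1}^{N}
  \log P\big( \sigma_r^{(b)} \mid \sigma_{\setminus r}^{(b)} \big)
\]
```

    Each conditional normalizes over only 21 states, which is why this objective sidesteps the intractable partition function $Z$ of the full model.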

    Inverse Ising inference using all the data

    We show that a method based on logistic regression, using all the data, solves the inverse Ising problem far better than mean-field calculations relying only on sample pairwise correlation functions, while still computationally feasible for hundreds of nodes. The largest improvement in reconstruction occurs for strong interactions. Using two examples, a diluted Sherrington-Kirkpatrick model and a two-dimensional lattice, we also show that interaction topologies can be recovered from few samples with good accuracy and that the use of l1-regularization is beneficial in this process, pushing inference abilities further into low-temperature regimes.
    Comment: 5 pages, 2 figures. Accepted version
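
    The logistic-regression approach described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`exact_samples`, `fit_node`, `infer_ising`), the plain gradient-ascent optimizer, the step size, and the energy convention are all assumptions made for this example, and the l1 penalty mentioned in the abstract is omitted for brevity. The sketch fits, for each spin, the conditional probability of that spin given all the others, then symmetrizes the resulting couplings.

```python
import itertools
import numpy as np


def exact_samples(J, h, n_samples, rng):
    """Draw exact samples from a small Ising model by enumerating all 2^n states.

    Assumed convention: P(s) is proportional to
    exp(sum_i h_i s_i + sum_{i<j} J_ij s_i s_j), with s_i in {-1, +1},
    J symmetric with zero diagonal.
    """
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    log_w = states @ h + 0.5 * np.einsum("si,ij,sj->s", states, J, states)
    p = np.exp(log_w - log_w.max())
    p /= p.sum()
    idx = rng.choice(len(states), size=n_samples, p=p)
    return states[idx]


def fit_node(samples, i, steps=2000, lr=0.1):
    """Logistic regression of spin i on all other spins (gradient ascent).

    Uses P(s_i | rest) = 1 / (1 + exp(-2 s_i (h_i + sum_j J_ij s_j))).
    Returns the estimated field h_i and the row of couplings J_ij, j != i.
    """
    s_i = samples[:, i]
    X = np.delete(samples, i, axis=1)
    X = np.hstack([np.ones((len(X), 1)), X])  # column 0 carries the field
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        margin = 2.0 * s_i * (X @ theta)
        # Gradient of the mean conditional log-likelihood.
        grad = (2.0 * s_i / (1.0 + np.exp(margin))) @ X / len(X)
        theta += lr * grad
    return theta[0], theta[1:]


def infer_ising(samples, steps=2000, lr=0.1):
    """Run one regression per node and average J_ij with J_ji."""
    n = samples.shape[1]
    h = np.zeros(n)
    J = np.zeros((n, n))
    for i in range(n):
        h[i], row = fit_node(samples, i, steps, lr)
        J[i, np.arange(n) != i] = row
    return h, 0.5 * (J + J.T)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    J_true = np.zeros((n, n))
    for k in range(n - 1):  # open chain with nearest-neighbour coupling 0.5
        J_true[k, k + 1] = J_true[k + 1, k] = 0.5
    data = exact_samples(J_true, np.zeros(n), 20000, rng)
    h_hat, J_hat = infer_ising(data)
    print(np.round(J_hat, 2))
```

    Because every regression uses the full sample matrix rather than only pairwise correlations, this is one way to read the phrase "using all the data"; exact enumeration is only practical for the toy system size used here.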

    PoGOLite - A High Sensitivity Balloon-Borne Soft Gamma-ray Polarimeter

    We describe a new balloon-borne instrument (PoGOLite) capable of detecting 10% polarisation from 200 mCrab point-like sources between 25 and 80 keV in one 6 hour flight. Polarisation measurements in the soft gamma-ray band are expected to provide a powerful probe into high-energy emission mechanisms as well as the distribution of magnetic fields, radiation fields and interstellar matter. At present, only exploratory polarisation measurements have been carried out in the soft gamma-ray band. Reduction of the large background produced by cosmic-ray particles has been the biggest challenge. PoGOLite uses Compton scattering and photo-absorption in an array of 217 well-type phoswich detector cells made of plastic and BGO scintillators surrounded by a BGO anticoincidence shield and a thick polyethylene neutron shield. The narrow FOV (1.25 msr) obtained with well-type phoswich detector technology and the use of thick background shields enhance the detected S/N ratio. Event selections based on recorded phototube waveforms and Compton kinematics reduce the background to that expected for a 40-100 mCrab source between 25 and 50 keV. A 6 hour observation on the Crab will differentiate between the Polar Cap/Slot Gap, Outer Gap, and Caustic models with greater than 5 sigma; and also cleanly identify the Compton reflection component in the Cygnus X-1 hard state. The first flight is planned for 2010 and long-duration flights from Sweden to Northern Canada are foreseen thereafter.
    Comment: 11 pages, 11 figures, 2 tables

    Detecting contacts in protein folds by solving the inverse Potts problem - a pseudolikelihood approach

    Spatially proximate amino acid positions in a protein tend to co-evolve, so a protein's 3D-structure leaves an echo of correlations in the evolutionary record. Reverse engineering 3D-structures from such correlations is an open problem in structural biology, pursued with increasing vigor as new protein sequences continue to fill the data banks. Within this task lies a statistical stumbling block, rooted in the following: correlation between two amino acid positions can arise from firsthand interaction, but also be network-propagated via intermediate positions; observed correlation is not enough to guarantee proximity. The remedy, and the focus of this thesis, is to mathematically untangle the crisscross of correlations and extract direct interactions, which enables a clean depiction of co-evolution among the positions. Recently, analysts have used maximum-entropy modeling to recast this cause-and-effect puzzle as parameter learning in a Potts model (a kind of Markov random field). Unfortunately, a computationally expensive partition function puts this out of reach of straightforward maximum-likelihood estimation. Mean-field approximations have been used, but an arsenal of other approximate schemes exists. In this work, we re-implement an existing contact-detection procedure and replace its mean-field calculations with pseudo-likelihood maximization. We then feed both routines real protein data and highlight differences between their respective outputs. Our new program seems to offer a systematic boost in detection accuracy.